Developments of "Lëtzebuergesch" Resources for Automatic Speech Processing and Linguistic Studies
Authors
Abstract
In the present contribution we start with an overview of the linguistic situation of Luxembourg. We then describe specificities of spoken and written Lëtzebuergesch with respect to automatic speech processing. Multilingual code-switching and code-mixing, poor writing standardization compared to languages such as English or French, a large diversity of spoken varieties, and a limited written production of Lëtzebuergesch together pose many interesting challenges to automatic speech processing, both for speech technologies and for linguistic studies. Multilingual filtering has been investigated to separate Luxembourgish from German and French. Word list coverage and language model perplexity results, using sibling resources collected from the Web, are presented. A phonemic inventory has been adopted for pronunciation dictionary development, a grapheme-phoneme tool has been developed, and pronunciation research issues related to the multilingual context are highlighted. The results achieved in resource development make it possible to envision the realisation of an ASR system.
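The two text-resource measures named in the abstract, word list coverage and language model perplexity, can be illustrated with a minimal sketch. This is not the authors' setup: the function names `coverage` and `unigram_perplexity`, the add-one-smoothed unigram model, and the toy tokens are all assumptions chosen for brevity. A real evaluation would typically use higher-order n-gram models and much larger corpora.

```python
import math
from collections import Counter


def coverage(word_list, tokens):
    """Fraction of running tokens that appear in the word list."""
    vocab = set(word_list)
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t in vocab) / len(tokens)


def unigram_perplexity(train_tokens, test_tokens):
    """Perplexity of an add-one-smoothed unigram model on test_tokens."""
    if not test_tokens:
        raise ValueError("test_tokens must not be empty")
    counts = Counter(train_tokens)
    # Reserve one extra vocabulary slot that stands for any unseen word.
    vocab_size = len(counts) + 1
    total = len(train_tokens)
    log_prob = 0.0
    for t in test_tokens:
        p = (counts[t] + 1) / (total + vocab_size)
        log_prob += math.log(p)
    return math.exp(-log_prob / len(test_tokens))


# Toy, made-up token sequences for illustration only.
train = ["a", "a", "b", "c", "a", "b"]
test = ["a", "b", "d"]

print("coverage:", coverage(set(train), test))
print("perplexity:", unigram_perplexity(train, test))
```

Coverage here is the complement of the out-of-vocabulary rate on running text; perplexity is lower when the training text predicts the test text better, which is why both are commonly reported together when assessing collected text resources.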
Similar resources
A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation
Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by enabling natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from the speech signal has become a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...
Designing and implementing a system for Automatic recognition of Persian letters by Lip-reading using image processing methods
For many years, speech has been the most natural and efficient means of information exchange for human beings. With the advancement of technology and the prevalence of computer usage, the design and production of speech recognition systems have been considered by researchers. Among these, lip-reading techniques have encountered many challenges for speech recognition, that one of the challenges b...
Developments of Swahili resources for an automatic speech recognition system
This article describes our efforts to provide ASR resources for Swahili, a Bantu language spoken in a wide area of East Africa. We start with an introduction on the language situation, both at linguistic and digital level. Then, we report the selected strategies to develop a text corpus, a pronunciation dictionary and a speech corpus for this under-resourced language. We explore methodologies a...
Allophone-based acoustic modeling for Persian phoneme recognition
Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation, which refers to the integration of sounds, is one of the important obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighboring phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...
A Review on Speech Corpus Development for Automatic Speech Recognition in Indian Languages
Cini Kurian, Department of Computer Science, Al-Ameen College, Edathala, Aluva, Kerala. Abstract: Corpus development gained much attention due to recent statistics based natural language processing. It has new applications in Language Technology, lingui...